Speaker identification and verification using eigenvoices

Authors

Olivier Thyes

Roland Kuhn

Patrick Nguyen

Jean-Claude Junqua

Abstract

Gaussian Mixture Models (GMMs) have been successfully applied to the tasks of speaker ID and verification when a large amount of enrolment data is available to characterize client speakers ([1],[10],[11]). However, there are many applications where it is unreasonable to expect clients to spend this much time training the system. Thus, we have been exploring the performance of various methods when only a small amount of enrolment data is available. Under such conditions, the performance of GMMs deteriorates drastically. A possible solution is the “eigenvoice” approach, in which client and test speaker models are confined to a low-dimensional linear subspace obtained previously from a different set of training data. One advantage of the approach is that it does away with the need for impostor models for speaker verification. After giving a detailed description of the eigenvoice approach, the paper compares the performance of conventional GMMs on speaker ID and verification with that of GMMs obtained by means of the eigenvoice approach. Experimental results are presented to show that conventional GMMs perform better if there are abundant enrolment data, while eigenvoice GMMs perform better if enrolment data are sparse. The paper also gives experimental results for the case where the eigenspace is trained on one database (TIMIT), but client enrolment and testing involve another (YOHO). For this case, we show that performance improves if an environment adaptation technique is applied to the eigenspace. Finally, we discuss priorities for future work.
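
The central idea in the abstract, confining each speaker model to a low-dimensional linear subspace learned from other speakers, can be illustrated with a minimal NumPy sketch. It assumes each speaker's GMM is summarized by a "supervector" of concatenated Gaussian means, finds the subspace by principal component analysis, and places a new speaker by orthogonal projection. The abstract does not spell out these details, so the function names (`train_eigenspace`, `project`), the synthetic data, and the projection step are illustrative assumptions rather than the authors' procedure.

```python
import numpy as np


def train_eigenspace(supervectors, k):
    """Return (mean, basis); the k rows of basis are orthonormal 'eigenvoices'.

    Assumption: each row of `supervectors` is one training speaker's
    concatenated GMM mean vectors.
    """
    X = np.asarray(supervectors, dtype=float)
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]


def project(supervector, mean, basis):
    """Return (coordinates, constrained supervector) for one speaker.

    Assumption: plain orthogonal projection stands in for whatever
    estimator the paper actually uses to locate a speaker in the eigenspace.
    """
    w = basis @ (np.asarray(supervector, dtype=float) - mean)
    return w, mean + w @ basis


# Synthetic demo: 20 training speakers, 12-dimensional supervectors
# (e.g. 4 Gaussians x 3 features) that vary mostly along 2 directions.
rng = np.random.default_rng(0)
true_dirs = rng.normal(size=(2, 12))
train = rng.normal(size=(20, 2)) @ true_dirs + 0.01 * rng.normal(size=(20, 12))

mean, basis = train_eigenspace(train, k=2)

# A new speaker with a noisy supervector, as if estimated from little data.
noisy = rng.normal(size=2) @ true_dirs + 0.5 * rng.normal(size=12)
coords, constrained = project(noisy, mean, basis)
```

Here a client is described by only `k` coordinates instead of a full set of GMM parameters, which is why such a model can plausibly be estimated from very little enrolment data.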

Similar resources

Rapid speaker adaptation for continuous speech recognition using merging eigenvoices

Speaker adaptation in eigenvoice space is a popular method for rapid speaker adaptation. To improve the performance of the method and to obtain stabilized results, the number of speaker-dependent models should be increased and a greater number of eigenvoices should be re-estimated. However, the huge computation time required to find eigenvoices makes these solutions difficult, especially in a c...

A comparative study of adaptation methods for speaker verification

Real-life speaker verification systems are often implemented using client model adaptation methods, since the amount of data available for each client is often too low to consider plain Maximum Likelihood methods. While the Bayesian Maximum A Posteriori (MAP) adaptation method is commonly used in speaker verification, other methods have proven to be successful in related domains such as speech ...

Minimum classification error/eigenvoices training for speaker identification

This paper describes a new training approach based on two different techniques (Minimum Classification Error and eigenvoices) in order to achieve a better robustness when only poor training data is provided. In the first two sections of this paper we describe the MCE training and the eigenvoice approach. Then a unified MCE/eigenvoice training algorithm is proposed describing theoretical advanta...

Using genetic algorithms for rapid speaker adaptation

This paper proposes two new approaches to rapid speaker adaptation of acoustic models by using genetic algorithms. Whereas conventional speaker adaptation techniques yield adapted models which represent locally optimal solutions, genetic algorithms are capable of providing multiple optimal solutions, thereby delivering potentially more robust adapted models. We have investigated two different strat...

Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems

Speaker verification from a few spoken words or sentences has many applications. Many methods, such as DTW, HMM, VQ and MQ, can be used for speaker verification. We applied MQ for its precise, reliable and robust performance combined with computational simplicity. We also used pitch frequency and log gain contour to further improve system performance.
